Abstract

Neural networks are ideally suited to describe the spatial and temporal dependence
of tracer-tracer correlations. The neural network performs well even in regions where
the correlations are less compact and a family of correlation curves would normally be
required. For example, the CH4-N2O correlation can be well described using a neural
network trained with the latitude, pressure, time of year, and CH4 volume mixing ratio
(v.m.r.). In this study a neural network using Quickprop learning and one hidden layer
with eight nodes was able to reproduce the CH4-N2O correlation with a correlation
coefficient of 0.9995. Such an accurate representation of tracer-tracer correlations allows
more use to be made of long-term datasets to constrain chemical models, such as the
dataset from the Halogen Occultation Experiment (HALOE), which has continuously
observed CH4 (but not N2O) from 1991 to the present. The neural network Fortran
code used is available for download.

1 Introduction 

The spatial distributions of atmospheric trace constituents are in general dependent on
both chemistry and transport. Compact correlations between long-lived species are
well-observed features in the middle atmosphere, as described for example by Fahey
et al. (1989); Plumb and Ko (1992); Loewenstein et al. (1993); Elkins et al. (1996);
Keim et al. (1997); Michelson et al. (1998); Rinsland et al. (1999); Strahan (1999);
Fischer et al. (2000); Muscari et al. (2003). The correlations exist for all long-lived
tracers - not just those which are chemically related. The tight relationships between
different constituents have led to many analyses where measurements of one tracer
are used to infer the abundance of another tracer. These correlations can also be used
as a diagnostic of mixing (Schoeberl et al., 1997; Morgenstern et al., 2002) and to
distinguish between air-parcels of different origins (Waugh and Funatsu, 2003).

Of special interest are the so-called “long-lived tracers”: constituents such as
nitrous oxide (N2O), methane (CH4), and the chlorofluorocarbons (CFCs) that have long
lifetimes (many years) in the troposphere and lower stratosphere, but are destroyed
rapidly in the middle and upper stratosphere.

The correlations are spatially and temporally dependent. For example, there is
a “compact-relation” regime in the lower part of the stratosphere and an
“altitude-dependent” regime above this. In the compact-relation regime, the abundance of one
tracer is uniquely determined by the value of the other tracer, without regard to other
variables such as latitude or altitude. In the altitude-dependent regime, the correlation
generally shows significant variation with altitude (Minschwaner et al., 1996) (Fig. 1d).
The description of such spatially and temporally dependent correlations is usually
achieved by a family of correlations. However, a single neural network is a natural and
effective alternative.

2 Motivation 

The motivation for this study was preparation for a long-term chemical assimilation
of Upper Atmosphere Research Satellite (UARS) (Reber et al., 1993) data starting
in 1991 and coming up to the present. For this period we have continuous version
19 data from the Halogen Occultation Experiment (HALOE) (Russell et al., 1993) but
no observations of N2O, as both ISAMS and CLAES failed. In addition we would
like to constrain the total amount of reactive nitrogen, chlorine, and bromine in a
self-consistent way. Tracer correlations provide a means to do this by using HALOE CH4
observations.

3 Neural Networks 

Computational neural networks are composed of simple elements operating in parallel.
These elements are inspired by biological nervous systems. As in nature, the
network function is determined largely by the connections between elements. A neural
network can be trained to perform a particular function by adjusting the values of the
connections (weights) between elements (Fig. 1b).

Commonly neural networks are trained so that a particular input leads to a specific
target output. The network is adjusted, based on a comparison of the output and the
target, until the network output matches the target. Typically many such input/target
pairs are used, in this supervised learning, to train a network. Batch training of a
network proceeds by making weight and bias changes based on an entire set (batch)
of input vectors. Incremental training changes the weights and biases of a network
as needed after presentation of each individual input vector. Incremental training is
sometimes referred to as “on line” or “adaptive” training.
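
The difference between the two update schemes can be illustrated with a minimal sketch. The example below fits a single linear unit by gradient descent on the squared error; the function names, learning rate, and toy data are illustrative assumptions and are not taken from the code used in this study.

```python
def batch_epoch(w, b, patterns, lr):
    """One batch epoch: accumulate gradients over all patterns, then update once."""
    grad_w = grad_b = 0.0
    for x, target in patterns:
        err = (w * x + b) - target
        grad_w += err * x
        grad_b += err
    n = len(patterns)
    return w - lr * grad_w / n, b - lr * grad_b / n


def incremental_epoch(w, b, patterns, lr):
    """One incremental epoch: update immediately after each pattern."""
    for x, target in patterns:
        err = (w * x + b) - target
        w -= lr * err * x
        b -= lr * err
    return w, b


# Toy input/target pairs drawn from target = 2x + 1 (hypothetical data).
patterns = [(i / 10.0, 2.0 * i / 10.0 + 1.0) for i in range(11)]

w_b, b_b = 0.0, 0.0  # weights trained in batch mode
w_i, b_i = 0.0, 0.0  # weights trained in incremental mode
for _ in range(5000):
    w_b, b_b = batch_epoch(w_b, b_b, patterns, lr=0.5)
    w_i, b_i = incremental_epoch(w_i, b_i, patterns, lr=0.5)
```

On this noise-free toy problem both schemes recover the same weight and bias; they differ only in how often the parameters are changed within an epoch.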

Neural networks have been trained to perform complex functions in various fields 
of application including pattern recognition, identification, classification, speech, vision 
and control systems. It is well established that multilayer feedforward networks are 
universal approximators (Hornik et al., 1989; Castro and Delgado, 1996; Ying, 1998). 

In this study we use neural networks (Peterson et al., 1994) to describe the temporal
and spatial dependence of tracer correlations (Fig. 1).

To find the optimum neural network configuration a range of network architectures
was considered, containing one or two hidden layers with between one and
sixteen nodes in each hidden layer. A range of updating procedures was also used,
including back-propagation, Manhattan learning, Langevin learning, Quickprop and
Rprop. Each network was trained for 10^6 epochs. The details of the different
learning methods can be found in Peterson et al. (1994). A variety of activation functions
were used. Non-linear activation functions performed best, and the most successful is
shown below in Eq. (1). To determine which network architecture and updating
procedure was most suitable, each configuration was tried in turn and the correlation
coefficient between the actual solution and the neural network solution was computed (the
correlation coefficient being a normalized measure of the strength of the linear relationship
between variables). The configuration with the highest correlation coefficient was
chosen.
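
The selection step described above can be sketched as follows. This is a minimal illustration only: the configuration names and the numbers are made up, and the correlation coefficient is assumed to be the standard Pearson coefficient.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def best_configuration(actual, predictions_by_config):
    """Return the (name, r) pair with the highest correlation coefficient."""
    scores = {
        name: correlation_coefficient(actual, predicted)
        for name, predicted in predictions_by_config.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical outputs of two trained configurations (made-up numbers).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "quickprop_1x8": [1.1, 1.9, 3.0, 4.1, 4.9],
    "backprop_2x4": [1.5, 1.5, 3.5, 3.0, 5.5],
}
name, r = best_configuration(actual, candidates)
```

Here `name` is the label of the candidate whose output tracks the actual solution most closely, and `r` is its correlation coefficient.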

3.1 The CH4-N2O Correlation

Fig. 1a shows the CH4-N2O correlation from the Cambridge 2D model (Law and Pyle,
1993a, b) overlaid with a neural network fit to the correlation. The neural network used
was a feed-forward multilayer perceptron with Quickprop learning (Peterson et al.,
1994). There were four inputs, one output, and one hidden layer with eight nodes. A
non-linear activation function was used, namely

$$ g(x) = \frac{1}{1 + \exp(-2x)} \tag{1} $$

The training dataset contained 1292 patterns, sampling the input space completely as
shown in Fig. 1. The network was trained for 10^6 epochs (iterations).
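
The architecture described above (four inputs, one hidden layer with eight nodes, one output, and the activation function of Eq. 1) can be sketched as a forward pass. This is an illustrative sketch, not the Fortran code used in the study: the weights are untrained random placeholders, the input values are hypothetical, and applying the same activation at the output node is an assumption.

```python
import math
import random


def activation(x):
    """Activation of Eq. (1): 1 / (1 + exp(-2x))."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> hidden layer -> single output."""
    hidden = [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Assumption: the output node uses the same activation as the hidden nodes.
    return activation(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


N_INPUTS, N_HIDDEN = 4, 8  # four inputs and eight hidden nodes, as in the text

# Untrained placeholder weights; a trained network would have learned values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-0.5, 0.5)

# Hypothetical scaled inputs: latitude, pressure, time of year, CH4 v.m.r.
scaled_inputs = [0.5, 0.3, 0.0, 0.7]
n2o_scaled = forward(scaled_inputs, w_hidden, b_hidden, w_out, b_out)
```

Because the activation maps any real number into the interval (0, 1), the output of this sketch is directly comparable with a target that has been scaled to lie between zero and one.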

The correlation coefficient between the actual solution and the neural network
solution was 0.9995. Figure 1 panel (b) shows how the median fractional error of the neural
network decreases with epoch (iteration). Both CH4 and pressure are strongly
correlated with N2O, as can be seen in panels (c) and (d). Latitude and time are only weakly
correlated with N2O, as can be seen in panels (e) and (f). Even though the correlation
with time of year and latitude is relatively weak, it still plays a role in capturing some
of the details of the CH4-N2O correlation in panel (a).
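
As a rough illustration of the error measure, the median fractional error can be computed as below. The definition used here (absolute difference divided by the absolute actual value) is an assumption, since the text does not spell it out, and the numbers are made up.

```python
from statistics import median


def median_fractional_error(actual, predicted):
    """Median of |predicted - actual| / |actual| over all patterns."""
    return median(abs(p - a) / abs(a) for a, p in zip(actual, predicted))


# Hypothetical N2O values (actual) and network outputs (predicted).
actual = [100.0, 200.0, 300.0]
predicted = [102.0, 198.0, 309.0]
mfe = median_fractional_error(actual, predicted)
```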

A polynomial or other fit will typically do a good job of describing the CH4-N2O
correlation for high values of CH4 and N2O. However, for low values of CH4 and N2O
there is quite a spread in the relationship which a single curve cannot describe. This
is the altitude-dependent regime where the correlation shows significant variation with
altitude (Minschwaner et al., 1996).

Figure 1c shows a more conventional fit using a Chebyshev polynomial of order 20.
This fit was chosen as giving the best agreement with the CH4-N2O correlation after
performing fits using 3667 different equations. Even though this is a good fit, the spread
of values cannot be described by a single curve. However, a neural network trained
with the latitude, pressure, time of year, and CH4 volume mixing ratio (v.m.r.) (four
inputs) is able to reproduce the N2O v.m.r. (one output) well, including the spread for
low values of CH4 and N2O.

3.2 Scaling 

Variable scaling often allows neural networks to achieve better results. In this case all 
variables were scaled to vary between zero and one. If the initial range of values was 
more than an order of magnitude then log scaling was also applied. In the case of time 
of year the sine of the fractional time of year was used to avoid a step discontinuity at 
the start of the year. 
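
The scaling steps can be sketched as follows. This is a minimal illustration under stated assumptions: "more than an order of magnitude" is read as a max/min ratio above 10 for strictly positive values, the sine is taken of 2π times the fractional time of year so that it is periodic over one year, and the sample values are hypothetical.

```python
import math


def minmax_scale(values):
    """Linearly map values onto [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def scale_variable(values):
    """Apply log scaling if the range exceeds an order of magnitude, then map to [0, 1]."""
    if min(values) > 0 and max(values) / min(values) > 10.0:
        values = [math.log10(v) for v in values]
    return minmax_scale(values)


def time_of_year_feature(fraction_of_year):
    """Sine of the fractional time of year; equal at the start and end of the year."""
    return math.sin(2.0 * math.pi * fraction_of_year)


# Hypothetical sample values.
pressures = [100.0, 10.0, 1.0, 0.1]   # spans several decades, so log scaled
latitudes = [-90.0, 0.0, 90.0]        # linear scaling only
scaled_p = scale_variable(pressures)
scaled_lat = scale_variable(latitudes)
```

In this sketch the time-of-year feature lies between -1 and 1; it can be passed through `minmax_scale` like any other variable to bring it into the zero-to-one range.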

4 Conclusions 

Neural networks are ideally suited to describe the spatial and temporal dependence
of tracer-tracer correlations, even in regions where the correlations are less compact.
Useful insight can be gained into the relative roles of the input variables from visualizing
the network weight assignment. The neural network Fortran code used is available for
download.

Fig. 1. The neural network used to produce the CH4-N2O correlation in panel (a) used
Quickprop learning and one hidden layer with eight nodes. The correlation coefficient between the
actual solution and the neural network solution was 0.9995. Panel (b) shows how the median
fractional error of the neural network decreases with epoch (iteration). Both CH4 and pressure
are strongly correlated with N2O, as can be seen in panels (c) and (d). Latitude and time are
only weakly correlated with N2O, as can be seen in panels (e) and (f). Even though the
correlation with time of year and latitude is relatively weak, it still plays a role in capturing some of
the details of the CH4-N2O correlation in panel (a).
Panel titles: (a) Quickprop NN with 1 layer and 8 nodes; (b) Fractional Error for Quickprop NN with 1 layer and 8 nodes; (e) Latitude - N2O; (f) Time of Year - N2O.